Abstract—The purpose of this study is to provide a bibliometric
overview of the detection of XSS attacks using cutting-edge
technologies such as artificial intelligence, machine learning, and
big data. The Scopus database was searched for articles published
in English between 2009 and 2022 to discover current trends and
concerns in XSS attack detection. A total of 184 empirical
and non-empirical studies were compiled as a result of the
evaluation process. This study used qualitative computer-assisted
data analysis techniques. During the study period, the number of
articles published in scientific journals increased exponentially,
indicating that the research topic is still in the development
phase. The most productive and relevant journals, nations, and
authors are listed using bibliometric performance metrics. The study
also highlights the most important research trends, allowing numerous
new research lines to be proposed via visual mapping of thematic
maps. This study contributes to the field of XSS attack detection
by providing a comprehensive overview of the field's evolution and
current status, as well as a synthesized and organized summary of
the various perspectives, definitions, and trends in the field.


Index Terms—XSS attack, artificial intelligence, machine learning,
big data, blockchain.


Akshat Gaurav, Ronin Institute, Montclair, USA. Email: akshat.gaurav@ieee.org
Domenico Santaniello, University of Salerno, Italy. Email: dsantaniello@unisa.it
Avadhesh Kumar Gupta, Unitedworld School of Computational Intelligence,
Karnavati University, Gujarat, India. Email: dr.avadheshgupta@gmail.com
Francesco Colace, University of Salerno, Italy. Email: fcolace@unisa.it


I. INTRODUCTION


More and more programs and services are available online
for the convenience of end users [1], [2]. However, these
services and applications contain a number of security
flaws that can be exploited [3], [4], [5], [6]. These
vulnerabilities are attractive to cybercriminals, and organizations
may face serious consequences when they are exploited [7], [8].
The majority of attacks on the Internet are caused by security
flaws in the application architecture [9], such as incorrect
input validation and insufficient security controls. Cross-Site
Scripting (XSS) [10], [11], [12], [13], [6], [14], [15] is the
most widely exploited vulnerability on the Internet other than
DDoS attacks [16], [17], [18], [19], [20], [21], [22], [23], [24],
[25] and phishing attacks [26], [27]. An XSS vulnerability allows
an attacker to inject malicious code into a legitimate web
application, typically through an input that the application
fails to validate [28], [29], [30], [31], [32], [33], [34], [35].
More serious attacks such as phishing, keylogging, cookie
stealing, and the like may be carried out on the network as a
result of these intrusions. XSS attacks may be used in a variety
of ways to gain access to the private information of legitimate
users. There are three primary types of XSS attacks [36], [37]:
reflected XSS, stored XSS, and DOM-based XSS. Stored XSS flaws
are more difficult to identify than reflected XSS flaws [38].
Thus, researchers are working on the detection and identification
of different types of cyberattacks [39], [40], [41], [42], [43],
[44], [45].
Currently, the online social network (OSN) [46], [47], [48], [49],
[50], [51], [52] is one of the most widely used Internet
services. It allows people to communicate and exchange
knowledge. In terms of safety, however, OSNs have emerged as
a preferred target of cybercriminals and face several risks,
such as cross-site scripting (XSS) attacks [53], [54], [55]. In
this context, the authors in [56] propose a method for detecting
XSS in OSNs [57], [58], [59], [60] that relies on machine
learning [61], [62], [63], [64], [65]. Their approach begins
by capturing features from web pages and building classification
models. To construct their dataset, the authors simulate the
propagation of XSS worms. They then evaluate the classification
models on a test dataset, and the reported results indicate that
the method is effective at identifying XSS attacks.

Web applications are the systems most often targeted by
cybercriminals, and XSS is the most common attack vector.
The primary strategy for preventing XSS harm at the source code
level is a code audit. Manual audits and rule-based audit
technologies, however, have a number of limitations. Machine
learning [66] is a new research area in the era of big data that
may assist manual auditing [67].

XSS is one of the most frequently occurring vulnerabilities, and
its effects range from minor to disastrous. Nevertheless, XSS
detection remains an open problem. Previously, XSS was addressed
using both static and dynamic analysis, but because of the wide
variety of XSS payloads, neither method is complete. The authors
of [68] therefore suggest the use of a Genetic Algorithm (GA)
[50], [69], [70] and Reinforcement Learning (RL) to combat
XSS attacks. They test the method on real-world XSS attacks and
report that it outperforms techniques previously published in the
literature. They also report that it adapts better to changes in
XSS payloads, is more intelligible to end users, and improves as
the number of attacks increases.

Detecting web application attacks using machine learning
approaches is becoming more prevalent and is providing better
results. Injection attacks such as XSS are common in web
applications. Unknown XSS attacks can be detected only with
machine learning approaches [71], [72], [73], [74], [75], [76],
[77], [78], which are more effective than existing solutions such
as filter-based approaches, dynamic analysis, and static analysis.
However, existing research on machine learning for XSS detection
suffers from problems such as single base classifiers, limited
datasets, and imbalanced datasets. In [79], a large, labelled, and
balanced dataset was used to train supervised ensemble learning
algorithms to identify XSS attacks.
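
As an illustration of the feature-capturing step on which such detectors rely, the following Python sketch maps a raw request parameter to a few numeric features that a classifier could consume. The feature set, the regular expressions, and the name `extract_features` are assumptions made for this example only; they are not the features used in [56] or [79].

```python
import re
from urllib.parse import unquote

# Hypothetical patterns, chosen for illustration only.
SCRIPT_TAG = re.compile(r"<\s*script\b", re.IGNORECASE)
EVENT_HANDLER = re.compile(r"\bon[a-z]+\s*=", re.IGNORECASE)
JS_URI = re.compile(r"javascript\s*:", re.IGNORECASE)


def extract_features(payload: str) -> dict:
    """Map a raw request parameter to a small numeric feature vector."""
    # Undo percent-encoding so that encoded payloads expose their markup.
    decoded = unquote(payload)
    return {
        "length": len(decoded),
        "script_tags": len(SCRIPT_TAG.findall(decoded)),
        "event_handlers": len(EVENT_HANDLER.findall(decoded)),
        "js_uris": len(JS_URI.findall(decoded)),
        "angle_brackets": decoded.count("<") + decoded.count(">"),
        "percent_encoded": payload.count("%"),
    }


if __name__ == "__main__":
    for sample in ("hello world", "%3Cimg%20src%3Dx%20onerror%3Dalert(1)%3E"):
        print(sample, extract_features(sample))
```

Feature vectors of this kind would then be labelled and passed to a supervised learner; the choice of learner is outside the scope of this sketch.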

The rest of the paper is organized as follows. Section II
presents the methodology, and Section III presents the results and
discussion. Section IV concludes the paper.


II. METHODOLOGY


A detailed literature study was conducted to determine the
effect of cutting-edge technologies on XSS attack detection.
The PRISMA review method was used to guide the review
process. A systematic review is a research strategy
for the systematic and reproducible analysis and synthesis
of existing research. The process followed in this paper
comprised four steps: selection of the database, refinement
of the search criteria, coding of the retrieved material,
and evaluation of the information.


A. Eligibility Criteria 


Research on the impact of cutting-edge technology on
the detection of XSS attacks was included in the review.
Publications in English published between November 2009 and 2022
were included in order to map the current state of
research around the world.


B. Restrictions 


A limited number of publications were excluded from
consideration because they did not fit the research focus. For
example, only publications in computer science were considered;
other disciplines were excluded.


C. Data Source 


The data were gathered from the Scopus bibliographic database
in May 2022. The following keywords were
included in the search strategy to answer the research question.


• XSS or cross-site scripting
• AI or artificial intelligence
• Machine learning
• Blockchain
• Big data


D. Search Query Selection 


In order to obtain information from the Scopus database,
we used the following query:


TITLE-ABS-KEY ( ( xss OR "Cross Site Scripting" OR
"Cross-Site Scripting" ) AND ( ai OR "artificial intelligence"
OR "machine learning" OR "deep learning" OR "big data"
OR "blockchain" ) )
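
A query of this shape can be assembled programmatically from the two keyword groups, which makes it easier to rerun the search with a changed term list. The following Python sketch is one way to do so; the helper names `or_group` and `build_scopus_query` are hypothetical, and the sketch leaves single-word terms unquoted, which is a formatting choice of this example rather than a Scopus requirement.

```python
def or_group(terms):
    """Join terms with OR, quoting phrases that contain a space or hyphen."""
    quoted = [f'"{t}"' if (" " in t or "-" in t) else t for t in terms]
    return "( " + " OR ".join(quoted) + " )"


def build_scopus_query(attack_terms, technology_terms):
    """Combine two OR-groups with AND inside a TITLE-ABS-KEY field search."""
    return (
        f"TITLE-ABS-KEY ( {or_group(attack_terms)} "
        f"AND {or_group(technology_terms)} )"
    )


if __name__ == "__main__":
    attack_terms = ["xss", "Cross Site Scripting", "Cross-Site Scripting"]
    technology_terms = [
        "ai",
        "artificial intelligence",
        "machine learning",
        "deep learning",
        "big data",
        "blockchain",
    ]
    print(build_scopus_query(attack_terms, technology_terms))
```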


E. Tools Used 


VOSviewer (v1.6) and the R language were used to analyze the
data. These tools provide visual representations of the networks
that connect nations, institutions, journals, authors, and keywords,
making these connections easier to analyze and understand.
Science mapping can be carried out in the R language using
an analytic technique that supports longitudinal investigations.
Another benefit of this method is that it helps researchers discover
connections and interactions between previously studied topics
and new areas of study.


III. RESULTS AND DISCUSSION 


A bibliometric study of a field sheds light on how
the topic has developed and points to potential directions for
future study. It provides a bird's-eye view of several
facets of the field. This section is divided into subsections
for a more thorough analysis. We provide an overview of
scientific output over time, as well as a breakdown
by subject area and publication venue of the most widely cited
publications, institutions, and authors. Furthermore, we examine
the content findings to identify the most important trends in
the development of XSS attack detection.

The impact of cutting-edge technologies on XSS attack
detection is a significant field of study. Table I summarizes
our dataset. It includes 184 documents published from 2009 to
2022, with an average of 5.902 citations per document.
Furthermore, 477 unique authors published papers during
this period, which indicates that XSS attack detection
through cutting-edge technologies is an active topic.
Figure 1 shows that most of the documents fall within
computer science.


TABLE I: Overview of Dataset

Parameter | Details
Time Period | 2009:2022
Sources | 123
Papers | 184
References | 4329
article | 44
book chapter | 2
conference paper | 113
conference review | 23
review | 2
Keywords Plus (ID) | 997
Authors | 477
Single-authored documents | 29
Collaboration Index | 3.03


Fig. 1: Database specifications. Panels: (b) Type of source distribution; (c) Subject distribution.


A. Distribution of Source 


In this subsection, we analyze the publication sources.
To represent the productivity and impact of
sources, we use the number of citations, the number of
documents published, the h-index, the g-index, and the m-index
as comparative variables. The leading sources are listed in
Table II. Table II shows that the most cited source is
PROCEEDINGS - INTERNATIONAL SYMPOSIUM ON SOFTWARE
RELIABILITY ENGINEERING, ISSRE, with 82 citations.
Ranked by total citations, the top ten sources are:
PROCEEDINGS - INTERNATIONAL SYMPOSIUM ON SOFTWARE
RELIABILITY ENGINEERING, ISSRE; IEEE ACCESS; ACM
INTERNATIONAL CONFERENCE PROCEEDING SERIES; IEEE
TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING;
LECTURE NOTES IN COMPUTER SCIENCE (INCLUDING SUBSERIES
LECTURE NOTES IN ARTIFICIAL INTELLIGENCE AND LECTURE
NOTES IN BIOINFORMATICS); ADVANCES IN INTELLIGENT
SYSTEMS AND COMPUTING; INFORMATION AND SOFTWARE
TECHNOLOGY; PROCEEDINGS - 2011 INTERNATIONAL CONFERENCE
ON NETWORK-BASED INFORMATION SYSTEMS, NBIS 2011; JOURNAL
OF INFORMATION PROCESSING SYSTEMS; and PROCEEDINGS - IEEE
SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS.
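
The three indices used above can be computed from per-paper citation counts, as the following Python sketch shows. It assumes the usual definitions of the h-index and g-index, and it assumes that the m-index divides the h-index by the number of years from the source's first paper up to and including 2022, which is consistent with the m-index values reported in Table II. The function names and the per-paper citation counts in the usage example are illustrative and are not taken from the dataset.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    # With counts sorted in descending order, the condition holds for
    # ranks 1..h and fails afterwards, so counting the hits yields h.
    return sum(1 for rank, c in enumerate(ranked, start=1) if c >= rank)


def g_index(citations):
    """Largest g such that the top g papers total at least g*g citations."""
    ranked = sorted(citations, reverse=True)
    total, g = 0, 0
    for rank, c in enumerate(ranked, start=1):
        total += c
        if total >= rank * rank:
            g = rank
    return g


def m_index(citations, first_year, reference_year=2022):
    """h-index divided by the number of active years (inclusive count)."""
    years_active = reference_year - first_year + 1
    return h_index(citations) / years_active


if __name__ == "__main__":
    # Hypothetical split of 75 citations over 3 papers first published in 2019.
    example = [40, 25, 10]
    print(h_index(example), g_index(example), m_index(example, 2019))
```

Because the m-index normalizes by age, a recent source with a few well-cited papers can rank above an older source with the same h-index.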


TABLE II: Local Source Impact Details

Element | H Index | G Index | M Index | TC | NP | Paper Year
PROCEEDINGS - INTERNATIONAL SYMPOSIUM ON SOFTWARE RELIABILITY ENGINEERING, ISSRE | 1 | 1 | 0.077 | 82 | 1 | 2010
IEEE ACCESS | 3 | 3 | 0.750 | 75 | 3 | 2019
ACM INTERNATIONAL CONFERENCE PROCEEDING SERIES | 4 | 6 | 0.444 | 73 | 6 | 2014
IEEE TRANSACTIONS ON DEPENDABLE AND SECURE COMPUTING | 1 | 1 | 0.125 | 66 | 1 | 2015
LECTURE NOTES IN COMPUTER SCIENCE (INCLUDING SUBSERIES LECTURE NOTES IN ARTIFICIAL INTELLIGENCE AND LECTURE NOTES IN BIOINFORMATICS) | 6 | 7 | 0.545 | 63 | 9 | 2012
ADVANCES IN INTELLIGENT SYSTEMS AND COMPUTING | 5 | 6 | 0.500 | 52 | 6 | 2013
INFORMATION AND SOFTWARE TECHNOLOGY | 1 | 1 | 0.100 | 49 | 1 | 2013
PROCEEDINGS - 2011 INTERNATIONAL CONFERENCE ON NETWORK-BASED INFORMATION SYSTEMS, NBIS 2011 | 1 | 1 | 0.083 | 48 | 1 | 2011
JOURNAL OF INFORMATION PROCESSING SYSTEMS | 1 | 1 | 0.167 | 46 | 1 | 2017
PROCEEDINGS - IEEE SYMPOSIUM ON COMPUTERS AND COMMUNICATIONS | 1 | 1 | 0.091 | 42 | 1 | 2012
AD HOC NETWORKS | 1 | 1 | 0.250 | 40 | 1 | 2019
PROCEEDINGS OF 2011 3RD INTERNATIONAL CONFERENCE ON AWARENESS SCIENCE AND TECHNOLOGY, ICAST 2011 | 1 | 1 | 0.083 | 37 | 1 | 2011
COMPUTER NETWORKS | 1 | 1 | 0.333 | 34 | 1 | 2020
PROCEEDINGS - 2015 INTERNATIONAL CONFERENCE ON CYBER-ENABLED DISTRIBUTED COMPUTING AND KNOWLEDGE DISCOVERY, CYBERC 2015 | 1 | 1 | 0.125 | 31 | 1 | 2015
PROCEEDINGS - 16TH IEEE INTERNATIONAL CONFERENCE ON HIGH PERFORMANCE COMPUTING AND COMMUNICATIONS, HPCC 2014, 11TH IEEE INTERNATIONAL CONFERENCE ON EMBEDDED SOFTWARE AND SYSTEMS, ICESS 2014 AND 6TH INTERNATIONAL SYMPOSIUM ON CYBERSPACE SAFETY AND SECURITY, CSS 2014 | 1 | 1 | 0.111 | 23 | 1 | 2014
PROCEEDINGS OF THE 2015 12TH INTERNATIONAL JOINT CONFERENCE ON COMPUTER SCIENCE AND SOFTWARE ENGINEERING, JCSSE 2015 | 1 | 1 | 0.125 | 23 | 1 | 2015
PROCEEDINGS - 2016 IEEE SYMPOSIUM ON SECURITY AND PRIVACY, SP 2016 | 1 | 1 | 0.143 | 20 | 1 | 2016
PROCEEDINGS - 2017 IEEE/ACM 39TH INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING, ICSE 2017 | 1 | 1 | 0.167 | 18 | 1 | 2017
APPLIED SCIENCES (SWITZERLAND) | 2 | 3 | 0.667 | 17 | - | 2020
JOURNAL OF INTERNET SERVICES AND APPLICATIONS | 1 | 1 | 0.250 | 17 | 1 | 2019

TC: total citations; NP: number of papers; Paper Year: year of the source's first paper in the dataset.


1) Source Ranking according to Bradford's law: Bradford's law
is one of the most significant bibliometric laws. It states that
if the sources in a field are ranked by the number of papers they
publish on the subject and then divided into zones that each
contain roughly the same number of papers, the number of sources
per zone grows from one zone to the next by an approximately
constant factor, the Bradford multiplier. Consequently, if the
area of study is confined, only a small number of journals are
required to provide the core of the work. The number of journals
necessary to supply the same number of publications grows rapidly
beyond the nucleus, or first zone. For example, if two core
journals supply the first 300 articles, sixteen journals may be
needed to supply the next 300. As shown in Figure 2d, the most
important and valuable sources fall in zone 1 of the Bradford
plot. Hence, from the above discussion, we can say that LECTURE
NOTES IN COMPUTER SCIENCE (INCLUDING SUBSERIES LECTURE NOTES
IN ARTIFICIAL INTELLIGENCE AND LECTURE NOTES IN
BIOINFORMATICS), ADVANCES IN INTELLIGENT SYSTEMS AND
COMPUTING, COMMUNICATIONS IN COMPUTER AND INFORMATION
SCIENCE, ACM INTERNATIONAL CONFERENCE PROCEEDING SERIES,
JOURNAL OF PHYSICS: CONFERENCE SERIES, APPLIED SCIENCES
(SWITZERLAND), IEEE ACCESS, and 2020 4TH INTERNATIONAL
CONFERENCE ON ELECTRONICS, MATERIALS ENGINEERING AND
NANO-TECHNOLOGY, IEMENTECH 2020 are some of the leading sources
publishing research on XSS attack detection.
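
The zone assignment behind a Bradford plot can be sketched in a few lines of Python. The sketch below is a simplified version that ranks sources by article count and places each source in one of three zones according to the cumulative share of articles published by the sources ranked before it. The function name `bradford_zones` and the source counts in the usage example are hypothetical and do not come from the dataset.

```python
def bradford_zones(source_counts, n_zones=3):
    """Assign each source to a zone holding roughly 1/n_zones of all articles.

    `source_counts` maps a source name to its number of articles.
    Returns a dict mapping each source name to a zone number (1 = core).
    """
    ranked = sorted(source_counts.items(), key=lambda kv: kv[1], reverse=True)
    total = sum(count for _, count in ranked)
    zones = {}
    cumulative = 0  # articles contributed by higher-ranked sources
    for name, count in ranked:
        # Integer arithmetic avoids floating-point edge cases at zone borders.
        zones[name] = min(n_zones, cumulative * n_zones // total + 1)
        cumulative += count
    return zones


if __name__ == "__main__":
    # Hypothetical counts: one core source, three mid sources, ten minor ones.
    counts = {"S1": 10, "S2": 4, "S3": 3, "S4": 3}
    counts.update({f"S{i}": 1 for i in range(5, 15)})
    zones = bradford_zones(counts)
    for zone in (1, 2, 3):
        members = [name for name, z in zones.items() if z == zone]
        print(zone, len(members))
```

In this invented example each zone holds ten articles while the number of sources per zone grows from one to three to ten, which is the pattern that the Bradford multiplier describes.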


Fig. 2: Source Distribution


B. Authors and Country Distribution


In this subsection, we give statistical details about the
authors who are actively studying the impact of cutting-edge
technologies on XSS attack detection. There are several
ways to identify the most productive authors in
a field. One method is to rank authors
by the number of citations. Figure 3 presents the author
statistics. Figure 3a ranks the authors by
fractionalized article frequency, and Figure 3b shows the
most cited authors. From Figure 3a
and Figure 3b, it is clear that CUI B, HOWE JM, HUANG C,
MEREANI FA, CHAUDHARY P, FANG Y, LI Y, SHAR LK,
ZHOU Y, and ABDULLAH J are the authors most active
in the field.


Apart from the number of citations, frequency and relevance
are also variables by which the most renowned authors
in a particular subject may be determined. This analysis is
presented in Figure 3c and ??. Figure 3c shows that
the number of researchers interested in the field grows
year by year. In 2020, only five researchers
were working in the field, but in 2021 more than 10 researchers
started work in the domain. This also shows
that the research topic is still developing and that there is
scope for further research. Finally, ?? represents the work areas
of the authors; this figure is constructed on the principles
of Sankey diagrams. From ??, it is clear that Fleischmann
D and Gopalkrishna P work in the most diverse fields. The
research fields of the leading researchers are represented in
??.

The distribution of researchers by nation is also a significant
and useful component of bibliometric analysis. This metric
indicates the productivity of a country's researchers. Figure 4
shows the distribution of countries by total number of
publications and by corresponding author.

TABLE III: Author Impact Details

Element | h_index | g_index | m_index | TC | NP
SHAR LK | 3 | 3 | 0.3 | 133 | 3
TAN HBK | 2 | 2 | 0.2 | 115 | 2
NEUHAUS S | 1 | 1 | 0.077 | 82 | 1
ZIMMERMANN T | 1 | 1 | 0.077 | 82 | 1
BRIAND LC | 1 | 1 | 0.125 | 66 | 1
CUI B | 2 | 2 | 0.5 | 51 | 2
CHOI C | 1 | 1 | 0.083 | 48 | 1
CHOI J | 1 | 1 | 0.083 | 48 | 1
KIM H | 1 | 1 | 0.083 | 48 | 1
KIM P | 1 | 1 | 0.083 | 48 | 1
PARK JH | 1 | 1 | 0.167 | 46 | 1
RATHORE S | 1 | 1 | 0.167 | 46 | 1
SHARMA PK | 1 | 1 | 0.167 | 46 | 1
HUANG C | 3 | 4 | 0.6 | 45 | 4
DOS SANTOS EM | 1 | 1 | 0.091 | 42 | 1
FEITOSA E | 1 | 1 | 0.091 | 42 | 1
NUNAN AE | 1 | 1 | 0.091 | 42 | 1
SOUTO E | 1 | 1 | 0.091 | 42 | 1
YANG W | 1 | 1 | 0.25 | 41 | 1
ZUO W | 1 | 1 | 0.25 | 41 | 1


Fig. 3: Authors Statistics


Figure 4 shows the leading countries by publication
frequency. According to Figure 4, the top ten
countries are:


• CHINA (190)
• ITALY (94)
• KOREA (94)
• LUXEMBOURG (66)
• SINGAPORE (49)
• USA (46)
• BRAZIL (42)
• JAPAN (37)
• UNITED KINGDOM (31)
• INDIA (21)

Fig. 4: Countries' Scientific Production


Therefore, from ??, we can say that Indian researchers
have been actively working in this field for the past two years.
The next important factor is collaboration among authors
from different countries, which reflects the productivity of
a country. ?? represents the distribution of corresponding
authors and the nature of each paper, i.e., single-country
publication (SCP) or multiple-country publication (MCP).
From ??, it is clear that:

• Authors from China, India, Iran, Italy, Korea, Poland,
Norway, and the UK collaborate little with authors from
other countries.

• About 33% of the authors from the US collaborate with
authors from other countries.

• More than 50% of the authors from Malaysia and Portugal
collaborate with authors from other countries.

• Finally, all of the authors from Finland and Ireland
collaborate with authors from other countries.
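
The collaboration percentages in the list above correspond to the share of multiple-country publications among the papers attributed to each corresponding-author country. The following Python sketch shows one way to derive that share, assuming each paper is recorded as a pair of the corresponding author's country and the list of all author countries. The function name `collaboration_by_country` and the records in the usage example are hypothetical.

```python
from collections import defaultdict


def collaboration_by_country(papers):
    """Return {country: (scp, mcp, mcp_ratio)} per corresponding-author country.

    `papers` is an iterable of (corresponding_country, author_countries) pairs.
    A paper counts as MCP when its authors come from more than one country.
    """
    counts = defaultdict(lambda: [0, 0])  # country -> [SCP count, MCP count]
    for corresponding_country, author_countries in papers:
        is_mcp = len(set(author_countries)) > 1
        counts[corresponding_country][1 if is_mcp else 0] += 1
    return {
        country: (scp, mcp, mcp / (scp + mcp))
        for country, (scp, mcp) in counts.items()
    }


if __name__ == "__main__":
    # Hypothetical records, invented for illustration only.
    papers = [
        ("USA", ["USA"]),
        ("USA", ["USA", "USA"]),
        ("USA", ["USA", "CHINA"]),
        ("IRELAND", ["IRELAND", "ITALY"]),
    ]
    for country, stats in collaboration_by_country(papers).items():
        print(country, stats)
```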


C. Documents Distribution 


In this subsection, we give details about the
distribution of the research papers. The Scopus search
returned 184 documents related to our study. However, not
all published papers are equally important or informative
about the subject area. Therefore, we identify the highly
cited articles related to
the development of XSS attack detection techniques. These
papers are presented in Figure 5 and
Table IV. In Figure 5, the documents are shaded according
to their citation count: the more citations a paper has,
the darker it appears. Similarly, Table IV arranges the papers
by total citations, average citations, and normalized
citations.
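
The per-document citation measures can be derived from the publication year and the total citation count of each paper. The following Python sketch rests on two assumptions that the text does not state explicitly: that the average citation is the total citations divided by the number of years since publication (counted inclusively up to a reference year), and that the normalized citation is the total citations divided by the mean citations of all documents published in the same year. The function name `citation_metrics` and the records in the usage example are hypothetical.

```python
from collections import defaultdict


def citation_metrics(docs, reference_year=2022):
    """Return per-document citation measures, sorted by total citations.

    `docs` is a list of dicts with keys "title", "year", and "tc".
    """
    by_year = defaultdict(list)
    for doc in docs:
        by_year[doc["year"]].append(doc["tc"])
    mean_by_year = {year: sum(v) / len(v) for year, v in by_year.items()}

    rows = []
    for doc in docs:
        years_active = reference_year - doc["year"] + 1
        mean_tc = mean_by_year[doc["year"]]
        rows.append({
            "title": doc["title"],
            "tc": doc["tc"],
            "tc_per_year": doc["tc"] / years_active,
            # Guard against a year in which no document was cited.
            "normalized_tc": doc["tc"] / mean_tc if mean_tc else 0.0,
        })
    return sorted(rows, key=lambda row: row["tc"], reverse=True)


if __name__ == "__main__":
    # Hypothetical documents, invented for illustration only.
    docs = [
        {"title": "A", "year": 2010, "tc": 82},
        {"title": "B", "year": 2010, "tc": 18},
        {"title": "C", "year": 2019, "tc": 40},
    ]
    for row in citation_metrics(docs):
        print(row)
```

Normalizing by publication year lets a recent paper be compared with older ones that have had more time to accumulate citations.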


IV. CONCLUSION 


Using a bibliometric analysis, this study aims to assess 
the current state of the retail sector in light of the COVID- 
19 revisions, identify relevant problems and suggest future 
research challenges. As a result, our research has added to the 
body of knowledge in both the retail industry and the academic 
community. 

In this review a large number of articles are included 
(96 in total, between 2009 and 2022). The progress of the 
retail sector under COVID-19 is thoroughly documented in a 
wide range of academic disciplines. E-Commerce and supply 
chain management, consumer behavior analysis, and AI-based 
decision making are just a few examples of research related to 
retail currently being conducted. Although this topic has only 
been around for around two years, scholars from a wide range 
of fields are taking an interest in it. Aside from geographical 
and intellectual diversity, this interest in retail is also visible 
in the contributions made by nations and institutions (the
United States, China, and India) of diverse origin. Furthermore,
the number of papers and citations has grown exponentially 
in the last two years, leading us to believe that retail re- 
search is a growing trend. The rise in interest in retail is 
attributed to the introduction and implementation of cutting- 
edge technologies such as AI and ML in this sector. The 
most productive researchers actively working in this research 
field are: FLEISCHMANN D, GOPALAKRISHNA P, LOPES 
M, ABBU HR, and ABDULLAH NS. Based on co-occurrence 
analysis and the terms writers use to define their work, we 
have drawn together a pair of strategic diagrams that reveal 
both previously studied subjects and new research trends. 
The analysis shows that the most relevant themes studied 
in the current literature are retail stores, service industry, e- 
Commerce, supply chain management, sustainable develop- 
ment, behavioral analysis, and empirical analysis. Therefore, 
more frameworks and algorithms are needed to solve the issues 
related to these themes. 